08-effective-communication
Effective Communication
Q&A
Q: For the last part of lecture 07, I tried glance(m_ht_wt) but it didn’t work because m_ht_wt doesn’t exist. Is “m_ht_wt” supposed to be a model?
A: Yup, this model was the height by width model:
m_ht_wt <- linear_reg() |>
  set_engine("lm") |>
  fit(Height_in ~ Width_in, data = pp)
Q: I was not too sure what was going on when talking about the relationship between painting height and school.
A: I don’t think you were the only one confused! Briefly here (and I’m happy to chat more before/after class and in OH), we were looking to determine/quantify the relationship between the size (height) of a painting and the school from which the painting originated. This was an example of having more than two categories for a categorical (factor) predictor. The important points were understanding that each level is compared to the baseline, and what the linear model looks like when a predictor has multiple categories. Part 4 of the lab gets into this a bit more too. Definitely follow up if you’re unsure after doing that part of the lab!
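To make the "each level is compared to the baseline" point concrete, here is a minimal base R sketch. The data and the school labels "A", "B", and "C" are made up for illustration (they are not the paintings data from lecture), and `lm()` stands in for the tidymodels workflow used in class.

```r
# Hypothetical toy data: three paintings from each of three made-up schools.
toy_paintings <- data.frame(
  height_in = c(10, 12, 14, 20, 22, 24, 30, 32, 34),
  school    = factor(rep(c("A", "B", "C"), each = 3))  # "A" is the baseline level
)

# One categorical predictor with three levels.
fit_school <- lm(height_in ~ school, data = toy_paintings)
coefs <- coef(fit_school)

# Mean height within each school, for comparison with the coefficients.
group_means <- tapply(toy_paintings$height_in, toy_paintings$school, mean)

# The intercept is the baseline (school A) mean; schoolB and schoolC are
# differences from that baseline, not the group means themselves.
print(coefs)
print(group_means)
```

Adding the intercept to the schoolB coefficient recovers the school B mean, which is the sense in which every non-baseline level is "compared to the baseline."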
Q: How do you calculate the linear regression model when you have non-numeric values? For example, on lab 04, when it asks to calculate the linear regression model by gender, the gender appears only as male and female. Suppose male is 1 and female is 0 (interpreted by the function), then male linear regression model is y =ax + 1?
A: Close! The “1” would be plugged in as the value of x (in what you suggested), not as the intercept. So the function would be \(y = \beta_1 \cdot 1 + \beta_0\)
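A small sketch of that 0/1 coding, using made-up heights rather than the lab data. The variable names and values are assumptions for illustration, and base R `lm()` is used so the example is self-contained.

```r
# Hypothetical toy data (not from lab 04): heights in cm for two groups.
toy <- data.frame(
  height = c(160, 162, 158, 175, 178, 172),
  gender = factor(c("female", "female", "female", "male", "male", "male"),
                  levels = c("female", "male"))  # first level, "female", is coded 0
)

# lm() turns the factor into a 0/1 indicator column behind the scenes.
fit_toy <- lm(height ~ gender, data = toy)

# The design matrix shows the coding: gendermale is 1 for males, 0 for females.
design <- model.matrix(fit_toy)

b0 <- unname(coef(fit_toy)["(Intercept)"])  # intercept
b1 <- unname(coef(fit_toy)["gendermale"])   # coefficient on the 0/1 indicator

# Plug the indicator in as x: y = b1 * x + b0
pred_female <- b1 * 0 + b0
pred_male   <- b1 * 1 + b0
```

With this toy data the two predictions are just the two group means, which is why the coefficient on the indicator reads as "the difference between the groups."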
Course Announcements
- Lab04 due Friday
- HW02 due Monday
- Practice Midterms Now Available
- answers posted next week
- Midterm Exam
- will cover material through “Multiple Linear Regression”
- will be released/posted next Friday after lab
- will be due Monday Nov 6th at 11:59 PM
- will be an Rmd document and submitted via GitHub (like everything so far)
- will be completed individually (open Notes; open Internet)
Agenda
- Communicating for your audience
- Oral Communication
- Written Communication
- Visual Communication
Suggested Reading
- Bookdown Section 2.6 R Code Chunks & inline R code
- Bookdown Chapter 3: Documents
- Will Chase’s rstudio::conf2020 talk: “The Glamour of Graphics” [slides] [video]
Consider your audience
What does this mean?
❓ What does it mean to “consider your audience?”
Simply: You do the work so they don’t have to.
…also the aesthetic-usability effect exists.
What’s the right level?
General Audience
✔ background
🚫 limit technical details
🎉 emphasize take-home
Technical Audience
⬇ limit background
💻 all-the-details
🎉 emphasize take-home
Considerations
- Platform: written? oral?
- Setting: informal? formal?
- Timing: never go over your time limit!
Storytelling
- Stories have a beginning, a middle, and an end.
- Stories do not need every detail of what you’ve tried
- Reports and presentations should tell a story
- Planning out your report/presentation can help
- Hold the audience’s attention with what needs to be said; do so effectively
- Tell your audience why they should care; why it matters
- You should explain your choices and the “why”
Choose informative titles
On presentations: Balance b/w short and informative (goal: concise)
Avoid: “Analyzing NHANES”
Better: “Data from the NHANES study shows that diet is related to overall health”
On visualizations: emphasize the take-home! (what’s learned or what action to take)
Avoid: “Boxplot of gender”
Better: “Twice as many females as males included for analysis”
Avoid: “Tickets vs. Time”
Better: “Staff unable to respond to incoming tickets; need to hire 2 FTEs”
Effective Oral Communication
Brainstorm: Advice You’ve Been Given?
Student responses
Student responses will be added to notes after class…
Presentations are for listening
- Advantage: words to explain out loud what you’re showing
- You are presenting for the person in the back of the room.
To accomplish:
don’t read directly off slides
repetition is ok: tell what you’re going to tell them, tell them, tell them what you told them
use animation to build your story (not to distract)
introduce your axes
text/labels larger
watch your speech speed
practice!
For Example: A Happy Ending for (almost) everyone in Little Red Riding Hood
- Red Riding Hood (RRH) has to walk 0.54 mi from Point A (home) to Point B (Grandma’s)
- RRH meets Wolf who (1) runs ahead to Grandma’s, (2) eats her, and (3) dresses in her clothes
- RRH arrives at Grandma’s at 2PM, asks her three questions
- Identified problem: after third question, Wolf eats RRH
- Solution: vendor (Woodsman) employs tool (ax)
- Expected outcome: Grandma and RRH alive, wolf is not
Little Red Riding Hood
Effective Written Communication
Brainstorm: Advice You’ve Been Given?
Student responses
Student responses will be added to notes after class…
Benefits of written communication
Your audience has time to process…but the explanation has to be there!
Visually: more on a single visualization
Yes, often there are different visualizations for reports/papers than for presentations/lectures.
When you have time to digest (read)
❓ What makes this an effective visualization for a written communication?
Source: Storytelling with Data by Cole Nussbaumer Knaflic
Written Explanations
- Visualizations should be explained/interpreted
- Models should be explained
- should be clear what question is being answered
- what conclusion is being drawn
- and what numbers were used to draw that conclusion
Data Science Reports in .Rmd
- As concise as possible
- Necessary details (for your audience); nothing more
- Be sure that the knit output contains what you intended (plots displayed; headers etc.)
- …and does NOT display stuff that doesn’t need to be there (messages/warnings suppressed, brainstorming, etc.)
- Typical Sections: Introduction/Background, Setup, Data, Analysis, Conclusion, References
Controlling HTML document settings
- Table of Contents
---
title: "Document Title"
output:
  html_document:
    toc: true
    toc_float: true
---
- Theme
---
title: "Document Title"
output:
  html_document:
    theme: united
    highlight: tango
---
- Figure Options
---
title: "Document Title"
output:
  html_document:
    fig_width: 7
    fig_height: 6
    fig_caption: true
---
- Code Folding
---
title: "Document Title"
output:
  html_document:
    code_folding: hide
---
Controlling code chunk output
- Specified in the curly braces, separated by commas
- eval: whether to execute the code chunk
- echo: whether to include the code in the output
- warning, message, and error: whether to show warnings, messages, or errors in the knit document
- fig.width and fig.height: control the width/height of plots
- Controlling for the whole document:
knitr::opts_chunk$set(fig.width = 8, collapse = TRUE)
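As a rough sketch of how the document-wide settings might look in a setup chunk near the top of a report: the particular values below are illustrative assumptions, not course requirements. A single chunk can still override any of them in its own curly braces, for example `{r, echo=FALSE, fig.width=6}`.

```r
# Sketch of a setup chunk body; the specific values are illustrative assumptions.
knitr::opts_chunk$set(
  echo = TRUE,       # include code in the knit output
  message = FALSE,   # hide messages (e.g., package startup notes)
  warning = FALSE,   # hide warnings
  fig.width = 8,     # default plot width, in inches
  fig.height = 5     # default plot height, in inches
)

# Look up the defaults that every later chunk will now inherit.
chunk_defaults <- knitr::opts_chunk$get(
  c("echo", "message", "warning", "fig.width", "fig.height")
)
print(chunk_defaults)
```

Setting message and warning once here is one way to keep the knit output free of material that doesn't need to be there, without repeating the options on every chunk.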
Editing & Proofreading
- Did you end up telling a story?
- Things missing?
- Things to delete?
- Do not fall in love with your words/code/plots
- Do spell check
- Do read it over before sending/presenting/submitting
Aside: Citing Sources
When are citations needed?
“We will be doing our analysis using two different data sets created by two different groups: Donohue and Mustard + Lott, or simply Lott”
“What turned from the idea of carrying firearms to protect oneself from enemies such as the British monarchy and the unknown frontier of North America has now become a nationwide issue.”
“Right to Carry Laws refer to laws that specify how citizens are allowed to carry concealed handguns when they’re away from home without a permit”
“In this case study, we are examining the relationship between unemployment rate, poverty rate, police staffing, and violent crime rate.”
“In the United States, the second amendment permits the right to bear arms, and this law has not been changed since its creation in 1791.”
“The Right to Carry Laws (RTC) is defined as ‘a law that specifies if and how citizens are allowed to have a firearm on their person or nearby in public.’”
Reminder: You do NOT get docked points for citing others’ work. You can be at risk of an academic integrity (AI) violation if you don’t. When in doubt, give credit.
Footnotes in .Rmd
How to specify a footnote in text:
Here is some body text.[^1]
How to include the footnote’s reference:
[^1]: This footnote will appear at the bottom of the page.
Note: .bib files can be included with BibTeX references using the bibliography parameter in your YAML
Effective Visual Communication
Brainstorm: Advice You’ve Been Given?
Student Responses
Student responses will be added to notes after class…
The Glamour of Graphics
- builds on top of the grammar (components) of a graphic
- considerations for the design of a graphic
- color, typography, layout
- going from accurate to 😍 effective
These ideas and slides are all modified from Will Chase’s rstudio::conf2020 slides/talk
Left-align titles at top-left
😬 Accurate
library(tidyverse)       # ggplot2 for plotting; forcats for the factor helpers used later
library(palmerpenguins)  # provides the penguins data
ggplot(penguins, aes(x = species, fill = species)) +
  geom_bar() +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
😍 Effective
ggplot(penguins, aes(x = species, fill = species)) +
  geom_bar() +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        # align the title with the left edge of the whole plot, not the panel
        plot.title.position = "plot")
Avoid head-tilting
😬 Accurate
ggplot(penguins, aes(x = species, fill = species)) +
  geom_bar() +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        plot.title.position = "plot")
😍 Effective
# mapping species to y makes the bars horizontal, so labels read without rotation
ggplot(penguins, aes(y = species, fill = species)) +
  geom_bar() +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme(plot.title.position = "plot")
Borders & Backgrounds: 👎
😬 Accurate
ggplot(penguins, aes(y = species, fill = species)) +
  geom_bar() +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme_bw() +
  theme(plot.title.position = "plot")
😍 Effective
ggplot(penguins, aes(y = species, fill = species)) +
  geom_bar() +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme_minimal() +  # drops the panel border and grey background
  theme(plot.title.position = "plot")
Organize & Remove/Lighten as much as possible
😬 Accurate
ggplot(penguins, aes(y = species, fill = species)) +
  geom_bar() +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme_minimal() +
  theme(plot.title.position = "plot")
😍 Effective
# order bars by frequency, with the most common species at the top
ggplot(penguins, aes(y = fct_rev(fct_infreq(species)), fill = species)) +
  geom_bar() +
  # label each bar with its count; after_stat(count) replaces the deprecated ..count..
  geom_text(stat = "count", aes(label = after_stat(count)), hjust = 1.5, color = "white", size = 6) +
  scale_x_continuous(expand = c(0, 0)) +
  # one dark bar for emphasis, the rest in a lighter grey
  scale_fill_manual(values = c("#454545", rep("#adadad", 2))) +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme_minimal(base_size = 18) +
  theme(axis.text.x = element_blank(),
        plot.title.position = "plot",
        panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        axis.title = element_blank())
Legends suck
😬 Accurate
ggplot(penguins, aes(y = fct_rev(fct_infreq(species)), fill = species)) +
  geom_bar() +
  geom_text(stat = "count", aes(label = after_stat(count)), hjust = 1.5, color = "white", size = 6) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_fill_manual(values = c("#454545", rep("#adadad", 2))) +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme_minimal(base_size = 18) +
  theme(axis.text.x = element_blank(),
        plot.title.position = "plot",
        panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        axis.title = element_blank())
😍 Effective
ggplot(penguins, aes(y = fct_rev(fct_infreq(species)), fill = species)) +
  geom_bar() +
  geom_text(stat = "count", aes(label = after_stat(count)), hjust = 1.5, color = "white", size = 7) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_fill_manual(values = c("#454545", rep("#adadad", 2))) +
  labs(title = "Adelie Penguins are the most common in Antarctica",
       subtitle = "Frequency of each penguin species studied near Palmer Station, Antarctica") +
  theme_minimal(base_size = 20) +
  theme(axis.text.x = element_blank(),
        plot.title.position = "plot",
        panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        axis.title = element_blank(),
        # the bars are already labeled on the axis, so the legend is redundant
        legend.position = "none")
Additional Guidance
- White space is like garlic - take the amount you need and triple it
- Fonts Matter
- Use Color Effectively